Overview

Dataset statistics

Number of variables: 16
Number of observations: 475673
Missing cells: 560206
Missing cells (%): 7.4%
Duplicate rows: 996
Duplicate rows (%): 0.2%
Total size in memory: 288.1 MiB
Average record size in memory: 635.1 B
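The percentages and the average record size follow from the raw counts in this block. A minimal sketch of that arithmetic, assuming 1 MiB = 1024 × 1024 bytes (the variable names are illustrative, not part of the report):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 16
n_observations = 475_673
missing_cells = 560_206
duplicate_rows = 996
total_size_mib = 288.1

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
# Assumption: the report uses binary mebibytes (1 MiB = 1024 * 1024 bytes).
avg_record_bytes = total_size_mib * 1024 * 1024 / n_observations

print(f"total cells:        {total_cells}")
print(f"missing cells (%):  {missing_pct:.1f}")
print(f"duplicate rows (%): {duplicate_pct:.1f}")
print(f"avg record size:    {avg_record_bytes:.1f} B")
```

The rounded results should agree with the values the report lists for these statistics.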

Variable types

CAT: 10
NUM: 6

Warnings

MAKE has constant value "Ford" Constant
Dataset has 996 (0.2%) duplicate rows Duplicates
STATE has a high cardinality: 53 distinct values High cardinality
ENGINECYLINDERCOUNT is highly correlated with ENGINEDISPLACEMENT High correlation
ENGINEDISPLACEMENT is highly correlated with ENGINECYLINDERCOUNT High correlation
TRIM has 14955 (3.1%) missing values Missing
TRANSMISSIONTYPE has 6330 (1.3%) missing values Missing
EXTERIORBASECOLOR has 29420 (6.2%) missing values Missing
INTERIORMATERIAL has 142187 (29.9%) missing values Missing
BODYCABSTYLE has 315340 (66.3%) missing values Missing
ODOMETER has 6952 (1.5%) missing values Missing
LISTPRICE has 44089 (9.3%) missing values Missing
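The warnings list only the variables with the largest gaps. As a cross-check, the sketch below adds up the per-variable missing counts that this report gives (the seven warnings plus the smaller counts shown in the variable sections) and compares the sum with the dataset-level "Missing cells" figure. The dictionary is transcribed by hand from the report; nothing here is computed from the underlying data.

```python
# Missing-value counts per variable, copied from this report.
# Variables not listed here are reported with 0 missing values.
missing_by_variable = {
    "TRIM": 14_955,
    "ENGINEDISPLACEMENT": 431,
    "ENGINECYLINDERCOUNT": 367,
    "TRANSMISSIONTYPE": 6_330,
    "DRIVETRAINTYPE": 85,
    "EXTERIORBASECOLOR": 29_420,
    "INTERIORMATERIAL": 142_187,
    "BODYTYPE": 50,
    "BODYCABSTYLE": 315_340,
    "ODOMETER": 6_952,
    "LISTPRICE": 44_089,
}
n_observations = 475_673

total_missing = sum(missing_by_variable.values())
print("total missing cells:", total_missing)

# Per-variable share of rows that are missing, largest first.
for name, count in sorted(missing_by_variable.items(),
                          key=lambda item: item[1], reverse=True):
    print(f"{name:<20} {count:>7} {100 * count / n_observations:5.1f}%")
```

Two variables, BODYCABSTYLE and INTERIORMATERIAL, account for most of the missing cells.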

Reproduction

Analysis started: 2020-11-17 20:42:57.984803
Analysis finished: 2020-11-17 20:43:32.774111
Duration: 34.79 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

POSTALCODE
Real number (ℝ≥0)

Distinct: 10017
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51400.14712
Minimum: 601
Maximum: 99929
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 MiB

Quantile statistics

Minimum: 601
5-th percentile: 7735
Q1: 30097
Median: 48838
Q3: 75751
95-th percentile: 95136
Maximum: 99929
Range: 99328
Interquartile range (IQR): 45654

Descriptive statistics

Standard deviation: 26831.5795
Coefficient of variation (CV): 0.5220136712
Kurtosis: -1.05812849
Mean: 51400.14712
Median Absolute Deviation (MAD): 21680
Skewness: 0.03901453687
Sum: 2.444966218e+10
Variance: 719933658.6
Monotonicity: Not monotonic
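Several of these statistics are derived from others in the same block: the range and IQR from the quantiles, the variance and CV from the standard deviation and mean, and the sum from the mean and the row count. A short sketch of those relationships, using only the figures printed in this report (small differences are expected because the report rounds to about ten significant digits):

```python
import math

# Figures copied from the POSTALCODE statistics.
minimum, maximum = 601, 99929
q1, q3 = 30097, 75751
mean = 51400.14712
std = 26831.5795
n_observations = 475_673  # POSTALCODE has no missing values

value_range = maximum - minimum          # Range
iqr = q3 - q1                            # Interquartile range
variance = std ** 2                      # Variance
cv = std / mean                          # Coefficient of variation
total = mean * n_observations            # Sum

print("range:   ", value_range)
print("IQR:     ", iqr)
print("variance:", round(variance, 1))
print("CV:      ", round(cv, 10))
print("sum:     ", f"{total:.9e}")

# Compare against the reported values with a loose relative tolerance.
print(math.isclose(variance, 719933658.6, rel_tol=1e-6))
print(math.isclose(cv, 0.5220136712, rel_tol=1e-6))
print(math.isclose(total, 2.444966218e10, rel_tol=1e-6))
```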
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
14580                 1261     0.3%
92335                 861      0.2%
43125                 767      0.2%
95661                 750      0.2%
19382                 655      0.1%
66048                 640      0.1%
27616                 602      0.1%
23452                 595      0.1%
37204                 578      0.1%
65203                 574      0.1%
43616                 574      0.1%
49512                 566      0.1%
33612                 565      0.1%
77034                 562      0.1%
44503                 551      0.1%
85297                 544      0.1%
43537                 544      0.1%
33619                 536      0.1%
55901                 530      0.1%
49080                 529      0.1%
44035                 526      0.1%
77074                 515      0.1%
79936                 513      0.1%
32505                 511      0.1%
73114                 508      0.1%
Other values (9992)   460316   96.8%
 
Value   Count   Frequency (%)
601     6       < 0.1%
614     61      < 0.1%
725     1       < 0.1%
792     1       < 0.1%
919     4       < 0.1%
920     3       < 0.1%
924     6       < 0.1%
936     2       < 0.1%
959     4       < 0.1%
960     3       < 0.1%
 
Value   Count   Frequency (%)
99929   1       < 0.1%
99835   43      < 0.1%
99801   2       < 0.1%
99701   219     < 0.1%
99669   24      < 0.1%
99654   141     < 0.1%
99611   30      < 0.1%
99577   1       < 0.1%
99518   86      < 0.1%
99515   68      < 0.1%
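POSTALCODE is profiled as a real number, so three-digit values such as 601 appear among the smallest entries. If the column holds US ZIP codes (an assumption; the report does not say so), those values have most likely lost their leading zeros, and the numeric statistics above (mean, variance, skewness) describe identifiers rather than quantities. A minimal sketch of restoring the five-character form, with `format_zip` as a hypothetical helper name:

```python
def format_zip(value):
    """Render a numeric postal code as a 5-character, zero-padded string.

    Assumes 5-digit US ZIP codes that were parsed as numbers.
    """
    return str(int(value)).zfill(5)


# Smallest and largest values listed in this report.
for raw in (601, 614, 725, 99929):
    print(raw, "->", format_zip(raw))
```

Storing the column as a string before profiling would make the report treat it as categorical instead of numeric.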
 

STATE
Categorical

HIGH CARDINALITY

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 MiB
Value               Count   Frequency (%)
TX                  47230   9.9%
CA                  31711   6.7%
FL                  25385   5.3%
OH                  25224   5.3%
PA                  21867   4.6%
MI                  21687   4.6%
IL                  19057   4.0%
NC                  16525   3.5%
NY                  16452   3.5%
GA                  14818   3.1%
IN                  13872   2.9%
WI                  13409   2.8%
VA                  13130   2.8%
MO                  12790   2.7%
TN                  11750   2.5%
MN                  11496   2.4%
WA                  9686    2.0%
NJ                  9430    2.0%
AZ                  8899    1.9%
CO                  8597    1.8%
OK                  8202    1.7%
MA                  7974    1.7%
IA                  7570    1.6%
KY                  7538    1.6%
MD                  7350    1.5%
Other values (28)   84024   17.7%
 
Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%
Histogram of lengths of the category

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 24
Unique unicode categories: 1
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
A13320414.0%
 
N945239.9%
 
I804658.5%
 
M730817.7%
 
T716387.5%
 
C671607.1%
 
O607276.4%
 
L571716.0%
 
X472305.0%
 
H287593.0%
 
W272552.9%
 
F253852.7%
 
Y249642.6%
 
K230272.4%
 
P219602.3%
 
V202782.1%
 
S195572.1%
 
D172011.8%
 
G148371.6%
 
R108661.1%
 
J94301.0%
 
Z88990.9%
 
E83860.9%
 
U53430.6%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter951346100.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A13320414.0%
 
N945239.9%
 
I804658.5%
 
M730817.7%
 
T716387.5%
 
C671607.1%
 
O607276.4%
 
L571716.0%
 
X472305.0%
 
H287593.0%
 
W272552.9%
 
F253852.7%
 
Y249642.6%
 
K230272.4%
 
P219602.3%
 
V202782.1%
 
S195572.1%
 
D172011.8%
 
G148371.6%
 
R108661.1%
 
J94301.0%
 
Z88990.9%
 
E83860.9%
 
U53430.6%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin951346100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
A13320414.0%
 
N945239.9%
 
I804658.5%
 
M730817.7%
 
T716387.5%
 
C671607.1%
 
O607276.4%
 
L571716.0%
 
X472305.0%
 
H287593.0%
 
W272552.9%
 
F253852.7%
 
Y249642.6%
 
K230272.4%
 
P219602.3%
 
V202782.1%
 
S195572.1%
 
D172011.8%
 
G148371.6%
 
R108661.1%
 
J94301.0%
 
Z88990.9%
 
E83860.9%
 
U53430.6%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII951346100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
A13320414.0%
 
N945239.9%
 
I804658.5%
 
M730817.7%
 
T716387.5%
 
C671607.1%
 
O607276.4%
 
L571716.0%
 
X472305.0%
 
H287593.0%
 
W272552.9%
 
F253852.7%
 
Y249642.6%
 
K230272.4%
 
P219602.3%
 
V202782.1%
 
S195572.1%
 
D172011.8%
 
G148371.6%
 
R108661.1%
 
J94301.0%
 
Z88990.9%
 
E83860.9%
 
U53430.6%
 

MODELYEAR
Real number (ℝ≥0)

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2016.030101
Minimum: 2013
Maximum: 2018
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 MiB

Quantile statistics

Minimum: 2013
5-th percentile: 2013
Q1: 2015
Median: 2017
Q3: 2017
95-th percentile: 2018
Maximum: 2018
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.636203544
Coefficient of variation (CV): 0.0008115967831
Kurtosis: -0.9421869309
Mean: 2016.030101
Median Absolute Deviation (MAD): 1
Skewness: -0.5488263564
Sum: 958971086
Variance: 2.677162038
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value   Count    Frequency (%)
2017    148969   31.3%
2018    93641    19.7%
2016    72096    15.2%
2014    54482    11.5%
2015    53243    11.2%
2013    53242    11.2%

Value   Count    Frequency (%)
2013    53242    11.2%
2014    54482    11.5%
2015    53243    11.2%
2016    72096    15.2%
2017    148969   31.3%
2018    93641    19.7%

Value   Count    Frequency (%)
2018    93641    19.7%
2017    148969   31.3%
2016    72096    15.2%
2015    53243    11.2%
2014    54482    11.5%
2013    53242    11.2%
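Because MODELYEAR has only six distinct values and no missing entries, its sum, mean and median can be recomputed from the value counts alone. The sketch below does that; `weighted_median` is a hypothetical helper and ignores the tie case where the midpoint falls exactly between two values, which cannot occur here because the row count is odd.

```python
# Value counts copied from the MODELYEAR frequency table.
year_counts = {
    2013: 53_242,
    2014: 54_482,
    2015: 53_243,
    2016: 72_096,
    2017: 148_969,
    2018: 93_641,
}

n = sum(year_counts.values())
total = sum(year * count for year, count in year_counts.items())
mean = total / n


def weighted_median(counts):
    """Median of a discrete distribution given as {value: count}."""
    half = sum(counts.values()) / 2
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running > half:
            return value
    raise ValueError("empty distribution")


median = weighted_median(year_counts)
print("rows:  ", n)
print("sum:   ", total)
print("mean:  ", round(mean, 6))
print("median:", median)
```

The same approach works for any low-cardinality numeric column in the report, provided missing rows are excluded from the denominator.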
 

MAKE
Categorical

CONSTANT
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 MiB
Value   Count    Frequency (%)
Ford    475673   100.0%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Overview of Unicode Properties

Unique unicode characters: 4
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
F47567325.0%
 
o47567325.0%
 
r47567325.0%
 
d47567325.0%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter142701975.0%
 
Uppercase Letter47567325.0%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
F475673100.0%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
o47567333.3%
 
r47567333.3%
 
d47567333.3%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin1902692100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
F47567325.0%
 
o47567325.0%
 
r47567325.0%
 
d47567325.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1902692100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
F47567325.0%
 
o47567325.0%
 
r47567325.0%
 
d47567325.0%
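The character and category tables for MAKE can be reproduced from its single value count, which makes this variable a convenient illustration of how the Unicode statistics are built up. The sketch below uses Python's standard `unicodedata` module; `char_stats` and `CATEGORY_NAMES` are illustrative names, and this is not necessarily how pandas-profiling computes the tables internally.

```python
import unicodedata
from collections import Counter

# Value counts copied from the MAKE frequency table.
value_counts = {"Ford": 475_673}

# Readable names for the two Unicode general categories that occur here.
CATEGORY_NAMES = {"Lu": "Uppercase Letter", "Ll": "Lowercase Letter"}


def char_stats(counts):
    """Return (per-character, per-category) totals weighted by value count."""
    chars, categories = Counter(), Counter()
    for value, n in counts.items():
        for ch in value:
            chars[ch] += n
            code = unicodedata.category(ch)
            categories[CATEGORY_NAMES.get(code, code)] += n
    return chars, categories


chars, categories = char_stats(value_counts)
total = sum(chars.values())
for name, n in categories.most_common():
    print(f"{name:<17} {n:>8} {100 * n / total:5.1f}%")
```

Each of the four characters occurs once per row, so every character accounts for 25.0% and the lowercase letters together for 75.0%.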
 

MODEL
Categorical

Distinct: 41
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.6 MiB
Value               Count    Frequency (%)
F-150               113321   23.8%
Escape              73891    15.5%
Explorer            56713    11.9%
Fusion              43090    9.1%
Focus               33894    7.1%
Edge                33549    7.1%
Mustang             20452    4.3%
F-250SD             17934    3.8%
F-350SD             10823    2.3%
Fiesta              10250    2.2%
Taurus              8266     1.7%
Expedition          6207     1.3%
Fusion Hybrid       6000     1.3%
Flex                5672     1.2%
Transit Connect     4850     1.0%
Transit-350         4332     0.9%
Expedition EL       3739     0.8%
EcoSport            3164     0.7%
C-Max Hybrid        2822     0.6%
Transit-250         2652     0.6%
Fusion Energi       2533     0.5%
Transit-150         2210     0.5%
C-Max Energi        2033     0.4%
E-350SD             1322     0.3%
F-550SD             1033     0.2%
Other values (16)   4921     1.0%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 26
Median length: 6
Mean length: 6.422460388
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 45
Unique unicode categories: 5
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
F2458658.0%
 
s2127967.0%
 
e2033996.7%
 
E1899196.2%
 
o1662305.4%
 
-1612845.3%
 
51574625.2%
 
01564215.1%
 
r1542005.0%
 
p1455824.8%
 
a1330584.4%
 
u1227734.0%
 
c1182113.9%
 
11161163.8%
 
i1134193.7%
 
n1123813.7%
 
x791822.6%
 
t666842.2%
 
l640012.1%
 
g585731.9%
 
d536011.8%
 
S359021.2%
 
D324521.1%
 
M263050.9%
 
248770.8%
 
Other values (20)1042983.4%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter182233459.7%
 
Uppercase Letter57721718.9%
 
Decimal Number46927915.4%
 
Dash Punctuation1612845.3%
 
Space Separator248770.8%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
F24586542.6%
 
E18991932.9%
 
S359026.2%
 
D324525.6%
 
M263054.6%
 
T223233.9%
 
C97471.7%
 
H88221.5%
 
L37390.6%
 
P7940.1%
 
I7940.1%
 
U5420.1%
 
R6< 0.1%
 
G5< 0.1%
 
A2< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
s21279611.7%
 
e20339911.2%
 
o1662309.1%
 
r1542008.5%
 
p1455828.0%
 
a1330587.3%
 
u1227736.7%
 
c1182116.5%
 
i1134196.2%
 
n1123816.2%
 
x791824.3%
 
t666843.7%
 
l640013.5%
 
g585733.2%
 
d536012.9%
 
y93640.5%
 
b88220.5%
 
h34< 0.1%
 
m24< 0.1%
 

Most frequent Dash Punctuation characters

ValueCountFrequency (%) 
-161284100.0%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
24877100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
515746233.6%
 
015642133.3%
 
111611624.7%
 
2214554.6%
 
3164773.5%
 
411810.3%
 
691< 0.1%
 
768< 0.1%
 
98< 0.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin239955178.5%
 
Common65544021.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
F24586510.2%
 
s2127968.9%
 
e2033998.5%
 
E1899197.9%
 
o1662306.9%
 
r1542006.4%
 
p1455826.1%
 
a1330585.5%
 
u1227735.1%
 
c1182114.9%
 
i1134194.7%
 
n1123814.7%
 
x791823.3%
 
t666842.8%
 
l640012.7%
 
g585732.4%
 
d536012.2%
 
S359021.5%
 
D324521.4%
 
M263051.1%
 
T223230.9%
 
C97470.4%
 
y93640.4%
 
H88220.4%
 
b88220.4%
 
Other values (9)59400.2%
 

Most frequent Common characters

ValueCountFrequency (%) 
-16128424.6%
 
515746224.0%
 
015642123.9%
 
111611617.7%
 
248773.8%
 
2214553.3%
 
3164772.5%
 
411810.2%
 
691< 0.1%
 
768< 0.1%
 
98< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII3054991100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
F2458658.0%
 
s2127967.0%
 
e2033996.7%
 
E1899196.2%
 
o1662305.4%
 
-1612845.3%
 
51574625.2%
 
01564215.1%
 
r1542005.0%
 
p1455824.8%
 
a1330584.4%
 
u1227734.0%
 
c1182113.9%
 
11161163.8%
 
i1134193.7%
 
n1123813.7%
 
x791822.6%
 
t666842.2%
 
l640012.1%
 
g585731.9%
 
d536011.8%
 
S359021.2%
 
D324521.1%
 
M263050.9%
 
248770.8%
 
Other values (20)1042983.4%
 

TRIM
Categorical

MISSING

Distinct: 39
Distinct (%): < 0.1%
Missing: 14955
Missing (%): 3.1%
Memory size: 3.6 MiB
Value               Count    Frequency (%)
SE                  123488   26.0%
XLT                 101190   21.3%
Titanium            36849    7.7%
SEL                 33832    7.1%
XL                  32738    6.9%
Lariat              23518    4.9%
Limited             23184    4.9%
S                   13954    2.9%
Base                11287    2.4%
Platinum            10734    2.3%
Sport               10354    2.2%
King Ranch          4789     1.0%
GT                  4500     0.9%
V6                  4349     0.9%
EcoBoost            3134     0.7%
GT Premium          3128     0.7%
ST                  2996     0.6%
EcoBoost Premium    2878     0.6%
FX4                 2846     0.6%
Raptor              2307     0.5%
STX                 1707     0.4%
SE Luxury           1401     0.3%
Shelby GT350        854      0.2%
SVT Raptor          840      0.2%
SHO                 784      0.2%
Other values (14)   3077     0.6%
(Missing)           14955    3.1%
 
Frequencies of value counts

Unique

Unique: 2
Unique (%): < 0.1%
Histogram of lengths of the category

Length

Max length: 27
Median length: 3
Mean length: 3.81406975
Min length: 1

Overview of Unicode Properties

Unique unicode characters: 42
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
L21596411.9%
 
S19203710.6%
 
i1666299.2%
 
E1653599.1%
 
T1524048.4%
 
X1388767.7%
 
a1294847.1%
 
t1139316.3%
 
n871374.8%
 
m853494.7%
 
u570233.1%
 
r457442.5%
 
e429582.4%
 
o323321.8%
 
d233161.3%
 
s175831.0%
 
B174081.0%
 
P173721.0%
 
151360.8%
 
p135010.7%
 
l124520.7%
 
c114220.6%
 
G87880.5%
 
R84190.5%
 
h58850.3%
 
Other values (17)377412.1%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter93268551.4%
 
Lowercase Letter85452947.1%
 
Space Separator151360.8%
 
Decimal Number119000.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
i16662919.5%
 
a12948415.2%
 
t11393113.3%
 
n8713710.2%
 
m8534910.0%
 
u570236.7%
 
r457445.4%
 
e429585.0%
 
o323323.8%
 
d233162.7%
 
s175832.1%
 
p135011.6%
 
l124521.5%
 
c114221.3%
 
h58850.7%
 
g47890.6%
 
y24970.3%
 
x14010.2%
 
b10960.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
L21596423.2%
 
S19203720.6%
 
E16535917.7%
 
T15240416.3%
 
X13887614.9%
 
B174081.9%
 
P173721.9%
 
G87880.9%
 
R84190.9%
 
V58060.6%
 
K47890.5%
 
F32080.3%
 
H7850.1%
 
O7840.1%
 
C6200.1%
 
Y66< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
15136100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
6494841.6%
 
4284623.9%
 
0151112.7%
 
511619.8%
 
39638.1%
 
24714.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin178721498.5%
 
Common270361.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
L21596412.1%
 
S19203710.7%
 
i1666299.3%
 
E1653599.3%
 
T1524048.5%
 
X1388767.8%
 
a1294847.2%
 
t1139316.4%
 
n871374.9%
 
m853494.8%
 
u570233.2%
 
r457442.6%
 
e429582.4%
 
o323321.8%
 
d233161.3%
 
s175831.0%
 
B174081.0%
 
P173721.0%
 
p135010.8%
 
l124520.7%
 
c114220.6%
 
G87880.5%
 
R84190.5%
 
h58850.3%
 
V58060.3%
 
Other values (10)200351.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
1513656.0%
 
6494818.3%
 
4284610.5%
 
015115.6%
 
511614.3%
 
39633.6%
 
24711.7%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1814250100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
L21596411.9%
 
S19203710.6%
 
i1666299.2%
 
E1653599.1%
 
T1524048.4%
 
X1388767.7%
 
a1294847.1%
 
t1139316.3%
 
n871374.8%
 
m853494.7%
 
u570233.1%
 
r457442.5%
 
e429582.4%
 
o323321.8%
 
d233161.3%
 
s175831.0%
 
B174081.0%
 
P173721.0%
 
151360.8%
 
p135010.7%
 
l124520.7%
 
c114220.6%
 
G87880.5%
 
R84190.5%
 
h58850.3%
 
Other values (17)377412.1%
 

ENGINEDISPLACEMENT
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 22
Distinct (%): < 0.1%
Missing: 431
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.110341047
Minimum: 1
Maximum: 6.8
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1.5
Q1: 2
Median: 2.7
Q3: 3.5
95-th percentile: 6.2
Maximum: 6.8
Range: 5.8
Interquartile range (IQR): 1.5

Descriptive statistics

Standard deviation: 1.390881585
Coefficient of variation (CV): 0.447179767
Kurtosis: 0.3083342061
Mean: 3.110341047
Median Absolute Deviation (MAD): 0.8
Skewness: 0.9529910523
Sum: 1478164.7
Variance: 1.934551583
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=22)
Value       Count    Frequency (%)
3.5         132728   27.9%
2           96437    20.3%
1.5         44784    9.4%
5           43161    9.1%
2.7         31133    6.5%
2.5         27419    5.8%
1.6         26538    5.6%
6.7         20444    4.3%
3.7         15741    3.3%
2.3         13505    2.8%
6.2         10507    2.2%
5.4         3437     0.7%
1           2919     0.6%
3.3         1786     0.4%
4.6         1249     0.3%
5.2         856      0.2%
6           770      0.2%
6.8         752      0.2%
3           498      0.1%
3.2         337      0.1%
5.8         240      0.1%
5.7         1        < 0.1%
(Missing)   431      0.1%

Value       Count    Frequency (%)
1           2919     0.6%
1.5         44784    9.4%
1.6         26538    5.6%
2           96437    20.3%
2.3         13505    2.8%
2.5         27419    5.8%
2.7         31133    6.5%
3           498      0.1%
3.2         337      0.1%
3.3         1786     0.4%

Value       Count    Frequency (%)
6.8         752      0.2%
6.7         20444    4.3%
6.2         10507    2.2%
6           770      0.2%
5.8         240      0.1%
5.7         1        < 0.1%
5.4         3437     0.7%
5.2         856      0.2%
5           43161    9.1%
4.6         1249     0.3%
 

ENGINECYLINDERCOUNT
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 6
Distinct (%): < 0.1%
Missing: 367
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.449142237
Minimum: 3
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 MiB

Quantile statistics

Minimum: 3
5-th percentile: 4
Q1: 4
Median: 6
Q3: 6
95-th percentile: 8
Maximum: 10
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.489566789
Coefficient of variation (CV): 0.2733580303
Kurtosis: -0.9200644719
Mean: 5.449142237
Median Absolute Deviation (MAD): 2
Skewness: 0.4960406441
Sum: 2590010
Variance: 2.218809218
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value       Count    Frequency (%)
4           208732   43.9%
6           181932   38.2%
8           80406    16.9%
3           2919     0.6%
10          980      0.2%
5           337      0.1%
(Missing)   367      0.1%

Value       Count    Frequency (%)
3           2919     0.6%
4           208732   43.9%
5           337      0.1%
6           181932   38.2%
8           80406    16.9%
10          980      0.2%

Value       Count    Frequency (%)
10          980      0.2%
8           80406    16.9%
6           181932   38.2%
5           337      0.1%
4           208732   43.9%
3           2919     0.6%
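The reported mean for ENGINECYLINDERCOUNT is consistent with the 367 missing rows being excluded from the denominator, while the percentages in the frequency table are taken over all rows. A small sketch that recomputes both from the value counts (the variable names are illustrative):

```python
# Value counts copied from the ENGINECYLINDERCOUNT frequency table.
cylinder_counts = {3: 2_919, 4: 208_732, 5: 337, 6: 181_932, 8: 80_406, 10: 980}
n_missing = 367
n_rows = 475_673

n_present = sum(cylinder_counts.values())
total = sum(cyl * count for cyl, count in cylinder_counts.items())

mean_present = total / n_present   # missing rows excluded
mean_all_rows = total / n_rows     # what dividing by every row would give

print("non-missing rows:", n_present)
print("sum:             ", total)
print("mean (non-missing rows):", round(mean_present, 9))
print("mean (all rows):        ", round(mean_all_rows, 9))
```

Only the first figure agrees with the mean shown in the report, which suggests the summary statistics skip missing values.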
 

TRANSMISSIONTYPE
Categorical

MISSING

Distinct: 3
Distinct (%): < 0.1%
Missing: 6330
Missing (%): 1.3%
Memory size: 3.6 MiB
Value       Count    Frequency (%)
Automatic   440410   92.6%
Manual      15545    3.3%
CVT         13388    2.8%
(Missing)   6330     1.3%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 9
Median length: 9
Mean length: 8.653242879
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 14
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
t88082021.4%
 
a47783011.6%
 
u45595511.1%
 
A44041010.7%
 
o44041010.7%
 
m44041010.7%
 
i44041010.7%
 
c44041010.7%
 
n282050.7%
 
M155450.4%
 
l155450.4%
 
C133880.3%
 
V133880.3%
 
T133880.3%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter361999587.9%
 
Uppercase Letter49611912.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
A44041088.8%
 
M155453.1%
 
C133882.7%
 
V133882.7%
 
T133882.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t88082024.3%
 
a47783013.2%
 
u45595512.6%
 
o44041012.2%
 
m44041012.2%
 
i44041012.2%
 
c44041012.2%
 
n282050.8%
 
l155450.4%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin4116114100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t88082021.4%
 
a47783011.6%
 
u45595511.1%
 
A44041010.7%
 
o44041010.7%
 
m44041010.7%
 
i44041010.7%
 
c44041010.7%
 
n282050.7%
 
M155450.4%
 
l155450.4%
 
C133880.3%
 
V133880.3%
 
T133880.3%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII4116114100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
t88082021.4%
 
a47783011.6%
 
u45595511.1%
 
A44041010.7%
 
o44041010.7%
 
m44041010.7%
 
i44041010.7%
 
c44041010.7%
 
n282050.7%
 
M155450.4%
 
l155450.4%
 
C133880.3%
 
V133880.3%
 
T133880.3%
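The character totals in this section only add up if the 6330 missing values are counted as the literal string "nan". This is an inference from the numbers, not something the report states, so the sketch below simply recomputes the totals with and without that assumption. The same pattern would explain the lowercase "n" and "a" counts listed under DRIVETRAINTYPE.

```python
from collections import Counter

# Value counts copied from the TRANSMISSIONTYPE frequency table.
value_counts = {"Automatic": 440_410, "Manual": 15_545, "CVT": 13_388}
n_missing = 6_330
n_rows = 475_673


def char_totals(counts):
    """Total occurrences of each character, weighted by value count."""
    totals = Counter()
    for value, n in counts.items():
        for ch in value:
            totals[ch] += n
    return totals


without_nan = char_totals(value_counts)
# Assumption: missing values were turned into the string "nan" before counting.
with_nan = char_totals({**value_counts, "nan": n_missing})

print("a:", without_nan["a"], "->", with_nan["a"])
print("n:", without_nan["n"], "->", with_nan["n"])
print("total characters:", sum(with_nan.values()))
print("mean length:", sum(with_nan.values()) / n_rows)
```

With the "nan" strings included, the counts for "a" and "n", the character total and the mean length all agree with the figures in this section. For a cleaner profile, missing values would need to be excluded before the string statistics are computed.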
 

DRIVETRAINTYPE
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 85
Missing (%): < 0.1%
Memory size: 3.6 MiB
Value       Count    Frequency (%)
4WD         190584   40.1%
FWD         180884   38.0%
RWD         60159    12.6%
AWD         43961    9.2%
(Missing)   85       < 0.1%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 8
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
W47558833.3%
 
D47558833.3%
 
419058413.4%
 
F18088412.7%
 
R601594.2%
 
A439613.1%
 
n170< 0.1%
 
a85< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Uppercase Letter123618086.6%
 
Decimal Number19058413.4%
 
Lowercase Letter255< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
W47558838.5%
 
D47558838.5%
 
F18088414.6%
 
R601594.9%
 
A439613.6%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n17066.7%
 
a8533.3%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
4190584100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin123643586.6%
 
Common19058413.4%
 

Most frequent Latin characters

ValueCountFrequency (%) 
W47558838.5%
 
D47558838.5%
 
F18088414.6%
 
R601594.9%
 
A439613.6%
 
n170< 0.1%
 
a85< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
4190584100.0%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII1427019100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
W47558833.3%
 
D47558833.3%
 
419058413.4%
 
F18088412.7%
 
R601594.2%
 
A439613.1%
 
n170< 0.1%
 
a85< 0.1%
 

EXTERIORBASECOLOR
Categorical

MISSING

Distinct: 13
Distinct (%): < 0.1%
Missing: 29420
Missing (%): 6.2%
Memory size: 3.6 MiB
Value       Count    Frequency (%)
White       116846   24.6%
Black       92442    19.4%
Gray        68472    14.4%
Silver      53888    11.3%
Red         52433    11.0%
Blue        36862    7.7%
Gold        9939     2.1%
Brown       6466     1.4%
Green       3255     0.7%
Orange      2695     0.6%
Beige       1901     0.4%
Yellow      943      0.2%
Purple      111      < 0.1%
(Missing)   29420    6.2%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 6
Median length: 5
Mean length: 4.534676133
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 26
Unique unicode categories: 2
Unique unicode scripts: 1
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
e27409012.7%
 
l1951289.0%
 
a1930298.9%
 
i1726358.0%
 
B1376716.4%
 
r1348876.3%
 
W1168465.4%
 
h1168465.4%
 
t1168465.4%
 
c924424.3%
 
k924424.3%
 
G816663.8%
 
n712563.3%
 
y684723.2%
 
d623722.9%
 
S538882.5%
 
v538882.5%
 
R524332.4%
 
u369731.7%
 
o173480.8%
 
w74090.3%
 
g45960.2%
 
O26950.1%
 
Y943< 0.1%
 
P111< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter171077079.3%
 
Uppercase Letter44625320.7%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
B13767130.9%
 
W11684626.2%
 
G8166618.3%
 
S5388812.1%
 
R5243311.7%
 
O26950.6%
 
Y9430.2%
 
P111< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
e27409016.0%
 
l19512811.4%
 
a19302911.3%
 
i17263510.1%
 
r1348877.9%
 
h1168466.8%
 
t1168466.8%
 
c924425.4%
 
k924425.4%
 
n712564.2%
 
y684724.0%
 
d623723.6%
 
v538883.1%
 
u369732.2%
 
o173481.0%
 
w74090.4%
 
g45960.3%
 
p111< 0.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin2157023100.0%
 

Most frequent Latin characters

ValueCountFrequency (%) 
e27409012.7%
 
l1951289.0%
 
a1930298.9%
 
i1726358.0%
 
B1376716.4%
 
r1348876.3%
 
W1168465.4%
 
h1168465.4%
 
t1168465.4%
 
c924424.3%
 
k924424.3%
 
G816663.8%
 
n712563.3%
 
y684723.2%
 
d623722.9%
 
S538882.5%
 
v538882.5%
 
R524332.4%
 
u369731.7%
 
o173480.8%
 
w74090.3%
 
g45960.2%
 
O26950.1%
 
Y943< 0.1%
 
P111< 0.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2157023100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
e27409012.7%
 
l1951289.0%
 
a1930298.9%
 
i1726358.0%
 
B1376716.4%
 
r1348876.3%
 
W1168465.4%
 
h1168465.4%
 
t1168465.4%
 
c924424.3%
 
k924424.3%
 
G816663.8%
 
n712563.3%
 
y684723.2%
 
d623722.9%
 
S538882.5%
 
v538882.5%
 
R524332.4%
 
u369731.7%
 
o173480.8%
 
w74090.3%
 
g45960.2%
 
O26950.1%
 
Y943< 0.1%
 
P111< 0.1%
 

INTERIORMATERIAL
Categorical

MISSING

Distinct: 7
Distinct (%): < 0.1%
Missing: 142187
Missing (%): 29.9%
Memory size: 3.6 MiB
Value                Count    Frequency (%)
Cloth                184114   38.7%
Leather              135521   28.5%
Vinyl                12633    2.7%
Artificial Leather   1202     0.3%
cloth                14       < 0.1%
43625                1        < 0.1%
8725                 1        < 0.1%
(Missing)            142187   29.9%
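The frequency table shows two kinds of inconsistency in INTERIORMATERIAL: a lowercase "cloth" next to "Cloth", and two purely numeric entries. One possible cleanup, sketched below on the value counts from this report, is to title-case the labels and set numeric strings aside. Treating the numeric entries as data-entry errors is an assumption, and `normalize` is a hypothetical helper.

```python
from collections import Counter

# Value counts copied from the INTERIORMATERIAL frequency table.
raw_counts = {
    "Cloth": 184_114,
    "Leather": 135_521,
    "Vinyl": 12_633,
    "Artificial Leather": 1_202,
    "cloth": 14,
    "43625": 1,
    "8725": 1,
}


def normalize(label):
    """Title-case a label; return None for purely numeric entries."""
    label = label.strip()
    if label.isdigit():
        return None  # assumption: numeric strings are data-entry errors
    return label.title()


cleaned = Counter()
rejected = 0
for label, n in raw_counts.items():
    key = normalize(label)
    if key is None:
        rejected += n
    else:
        cleaned[key] += n

for label, n in cleaned.most_common():
    print(f"{label:<20} {n:>7}")
print("rejected numeric entries:", rejected)
```

After this step the column would have four distinct values instead of seven, with the 14 lowercase rows folded into "Cloth".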
 
Frequencies of value counts

Unique

Unique: 2
Unique (%): < 0.1%
Histogram of lengths of the category

Length

Max length: 18
Median length: 5
Mean length: 5.004820538
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 24
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
t32205313.5%
 
h32085113.5%
 
n29700712.5%
 
a28011211.8%
 
e27344611.5%
 
l1979638.3%
 
o1841287.7%
 
C1841147.7%
 
r1379255.8%
 
L1367235.7%
 
i162390.7%
 
V126330.5%
 
y126330.5%
 
c12160.1%
 
A12020.1%
 
f12020.1%
 
12020.1%
 
22< 0.1%
 
52< 0.1%
 
81< 0.1%
 
71< 0.1%
 
41< 0.1%
 
31< 0.1%
 
61< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter204477585.9%
 
Uppercase Letter33467214.1%
 
Space Separator12020.1%
 
Decimal Number9< 0.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
C18411455.0%
 
L13672340.9%
 
V126333.8%
 
A12020.4%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
t32205315.8%
 
h32085115.7%
 
n29700714.5%
 
a28011213.7%
 
e27344613.4%
 
l1979639.7%
 
o1841289.0%
 
r1379256.7%
 
i162390.8%
 
y126330.6%
 
c12160.1%
 
f12020.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
1202100.0%
 

Most frequent Decimal Number characters

ValueCountFrequency (%) 
2222.2%
 
5222.2%
 
8111.1%
 
7111.1%
 
4111.1%
 
3111.1%
 
6111.1%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin237944799.9%
 
Common12110.1%
 

Most frequent Latin characters

ValueCountFrequency (%) 
t32205313.5%
 
h32085113.5%
 
n29700712.5%
 
a28011211.8%
 
e27344611.5%
 
l1979638.3%
 
o1841287.7%
 
C1841147.7%
 
r1379255.8%
 
L1367235.7%
 
i162390.7%
 
V126330.5%
 
y126330.5%
 
c12160.1%
 
A12020.1%
 
f12020.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
120299.3%
 
220.2%
 
520.2%
 
810.1%
 
710.1%
 
410.1%
 
310.1%
 
610.1%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2380658100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
t32205313.5%
 
h32085113.5%
 
n29700712.5%
 
a28011211.8%
 
e27344611.5%
 
l1979638.3%
 
o1841287.7%
 
C1841147.7%
 
r1379255.8%
 
L1367235.7%
 
i162390.7%
 
V126330.5%
 
y126330.5%
 
c12160.1%
 
A12020.1%
 
f12020.1%
 
12020.1%
 
22< 0.1%
 
52< 0.1%
 
81< 0.1%
 
71< 0.1%
 
41< 0.1%
 
31< 0.1%
 
61< 0.1%
 

BODYTYPE
Categorical

Distinct: 12
Distinct (%): < 0.1%
Missing: 50
Missing (%): < 0.1%
Memory size: 3.6 MiB
Value         Count    Frequency (%)
SUV           184475   38.8%
Truck         144143   30.3%
Sedan         85608    18.0%
Hatchback     23796    5.0%
Coupe         16396    3.4%
Cargo Van     9773     2.1%
Wagon         5936     1.2%
Convertible   4036     0.8%
Cab/Chassis   1199     0.3%
Minivan/Van   243      0.1%
RV            17       < 0.1%
Cutaway Van   1        < 0.1%
(Missing)     50       < 0.1%
 
Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%
Histogram of lengths of the category

Length

Max length: 11
Median length: 5
Mean length: 4.575441532
Min length: 2

Overview of Unicode Properties

Unique unicode characters: 31
Unique unicode categories: 4
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
S27008312.4%
 
V1945098.9%
 
c1917358.8%
 
U1844758.5%
 
k1679397.7%
 
a1616197.4%
 
u1605407.4%
 
r1579527.3%
 
T1441436.6%
 
e1100765.1%
 
n1061834.9%
 
d856083.9%
 
o361411.7%
 
C326041.5%
 
b290311.3%
 
t278331.3%
 
h249951.1%
 
H237961.1%
 
p163960.8%
 
g157090.7%
 
97740.4%
 
W59360.3%
 
i57210.3%
 
v42790.2%
 
l40360.2%
 
Other values (6)53010.2%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter130939260.2%
 
Uppercase Letter85580639.3%
 
Space Separator97740.4%
 
Other Punctuation14420.1%
 

Most frequent Uppercase Letter characters

ValueCountFrequency (%) 
S27008331.6%
 
V19450922.7%
 
U18447521.6%
 
T14414316.8%
 
C326043.8%
 
H237962.8%
 
W59360.7%
 
M243< 0.1%
 
R17< 0.1%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
c19173514.6%
 
k16793912.8%
 
a16161912.3%
 
u16054012.3%
 
r15795212.1%
 
e1100768.4%
 
n1061838.1%
 
d856086.5%
 
o361412.8%
 
b290312.2%
 
t278332.1%
 
h249951.9%
 
p163961.3%
 
g157091.2%
 
i57210.4%
 
v42790.3%
 
l40360.3%
 
s35970.3%
 
w1< 0.1%
 
y1< 0.1%
 

Most frequent Space Separator characters

ValueCountFrequency (%) 
9774100.0%
 

Most frequent Other Punctuation characters

ValueCountFrequency (%) 
/1442100.0%
 

Most occurring scripts

ValueCountFrequency (%) 
Latin216519899.5%
 
Common112160.5%
 

Most frequent Latin characters

ValueCountFrequency (%) 
S27008312.5%
 
V1945099.0%
 
c1917358.9%
 
U1844758.5%
 
k1679397.8%
 
a1616197.5%
 
u1605407.4%
 
r1579527.3%
 
T1441436.7%
 
e1100765.1%
 
n1061834.9%
 
d856084.0%
 
o361411.7%
 
C326041.5%
 
b290311.3%
 
t278331.3%
 
h249951.2%
 
H237961.1%
 
p163960.8%
 
g157090.7%
 
W59360.3%
 
i57210.3%
 
v42790.2%
 
l40360.2%
 
s35970.2%
 
Other values (4)262< 0.1%
 

Most frequent Common characters

ValueCountFrequency (%) 
977487.1%
 
/144212.9%
 

Most occurring blocks

ValueCountFrequency (%) 
ASCII2176414100.0%
 

Most frequent ASCII characters

ValueCountFrequency (%) 
S27008312.4%
 
V1945098.9%
 
c1917358.8%
 
U1844758.5%
 
k1679397.7%
 
a1616197.4%
 
u1605407.4%
 
r1579527.3%
 
T1441436.6%
 
e1100765.1%
 
n1061834.9%
 
d856083.9%
 
o361411.7%
 
C326041.5%
 
b290311.3%
 
t278331.3%
 
h249951.1%
 
H237961.1%
 
p163960.8%
 
g157090.7%
 
97740.4%
 
W59360.3%
 
i57210.3%
 
v42790.2%
 
l40360.2%
 
Other values (6)53010.2%
 

BODYCABSTYLE
Categorical

MISSING

Distinct: 9
Distinct (%): < 0.1%
Missing: 315340
Missing (%): 66.3%
Memory size: 3.6 MiB
Value                Count    Frequency (%)
Crew Cab             116521   24.5%
Extended Cab         21459    4.5%
Cargo Van            9343     2.0%
Standard Cab         7277     1.5%
Wagon                4184     0.9%
Extended Cargo Van   567      0.1%
Passenger Van        566      0.1%
Extended Wagon       414      0.1%
Mega Cab             2        < 0.1%
(Missing)            315340   66.3%
 
Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 18
Median length: 3
Mean length: 4.943318204
Min length: 3

Overview of Unicode Properties

Unique unicode characters: 20
Unique unicode categories: 3
Unique unicode scripts: 2
Unique unicode blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Most occurring characters

ValueCountFrequency (%) 
n67603728.8%
 
a50070521.3%
 
C27169011.6%
 
e1625356.9%
 
1567166.7%
 
b1452596.2%
 
r1342745.7%
 
w1165215.0%
 
d594342.5%
 
t297171.3%
 
E224401.0%
 
x224401.0%
 
g150760.6%
 
o145080.6%
 
V104760.4%
 
S72770.3%
 
W45980.2%
 
s1132< 0.1%
 
P566< 0.1%
 
M2< 0.1%
 

Most occurring categories

ValueCountFrequency (%) 
Lowercase Letter187763879.9%
 
Uppercase Letter31704913.5%
 
Space Separator1567166.7%
 

Most frequent Lowercase Letter characters

ValueCountFrequency (%) 
n67603736.0%
 
a50070526.7%
 
e1625358.7%
 
b1452597.7%
 
r1342747.2%
 
w1165216.2%
 
d594343.2%
 
t297171.6%
 
x224401.2%
 
g150760.8%
 
o145080.8%
 
s11320.1%
 

Most frequent Uppercase Letter characters

Value | Count | Frequency (%)
C | 271690 | 85.7%
E | 22440 | 7.1%
V | 10476 | 3.3%
S | 7277 | 2.3%
W | 4598 | 1.5%
P | 566 | 0.2%
M | 2 | < 0.1%

Most frequent Space Separator characters

Value | Count | Frequency (%)
(space) | 156716 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2194687 | 93.3%
Common | 156716 | 6.7%

Most frequent Latin characters

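Value | Count | Frequency (%)
n | 676037 | 30.8%
a | 500705 | 22.8%
C | 271690 | 12.4%
e | 162535 | 7.4%
b | 145259 | 6.6%
r | 134274 | 6.1%
w | 116521 | 5.3%
d | 59434 | 2.7%
t | 29717 | 1.4%
E | 22440 | 1.0%
x | 22440 | 1.0%
g | 15076 | 0.7%
o | 14508 | 0.7%
V | 10476 | 0.5%
S | 7277 | 0.3%
W | 4598 | 0.2%
s | 1132 | 0.1%
P | 566 | < 0.1%
M | 2 | < 0.1%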

Most frequent Common characters

Value | Count | Frequency (%)
(space) | 156716 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2351403 | 100.0%

Most frequent ASCII characters

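Value | Count | Frequency (%)
n | 676037 | 28.8%
a | 500705 | 21.3%
C | 271690 | 11.6%
e | 162535 | 6.9%
(space) | 156716 | 6.7%
b | 145259 | 6.2%
r | 134274 | 5.7%
w | 116521 | 5.0%
d | 59434 | 2.5%
t | 29717 | 1.3%
E | 22440 | 1.0%
x | 22440 | 1.0%
g | 15076 | 0.6%
o | 14508 | 0.6%
V | 10476 | 0.4%
S | 7277 | 0.3%
W | 4598 | 0.2%
s | 1132 | < 0.1%
P | 566 | < 0.1%
M | 2 | < 0.1%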

ODOMETER
Real number (ℝ≥0)

MISSING

Distinct: 136362
Distinct (%): 29.1%
Missing: 6952
Missing (%): 1.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 60952.48376
Minimum: 1
Maximum: 394967
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 MiB

Quantile statistics

Minimum: 1
5-th percentile: 14004
Q1: 31490
Median: 51890
Q3: 83964
95-th percentile: 133539
Maximum: 394967
Range: 394966
Interquartile range (IQR): 52474

Descriptive statistics

Standard deviation: 38840.94138
Coefficient of variation (CV): 0.6372331197
Kurtosis: 1.958794776
Mean: 60952.48376
Median Absolute Deviation (MAD): 24110
Skewness: 1.135226545
Sum: 2.856970914e+10
Variance: 1508618728
Monotonicity: Not monotonic
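
Several of the ODOMETER statistics above are derived from the others. As a quick consistency check, the following sketch recomputes the range, IQR, CV, and variance from the values printed in this report. The variable names are illustrative only and are not part of the report.

```python
import math

# Values copied from the ODOMETER summary in this report.
minimum, maximum = 1, 394967
q1, q3 = 31490, 83964
mean, std = 60952.48376, 38840.94138

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle 50% of the data
cv = std / mean                  # standard deviation relative to the mean
variance = std ** 2              # variance is the squared standard deviation

print(value_range, iqr, cv, variance)
```

The recomputed values agree with the reported Range, IQR, CV, and Variance up to rounding of the printed inputs.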
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
5 | 696 | 0.1%
10 | 513 | 0.1%
1 | 307 | 0.1%
13 | 305 | 0.1%
2 | 163 | < 0.1%
6 | 141 | < 0.1%
3 | 140 | < 0.1%
7 | 98 | < 0.1%
60000 | 92 | < 0.1%
80000 | 82 | < 0.1%
38000 | 81 | < 0.1%
65000 | 79 | < 0.1%
42000 | 79 | < 0.1%
30000 | 78 | < 0.1%
8 | 76 | < 0.1%
55000 | 74 | < 0.1%
98000 | 73 | < 0.1%
35000 | 73 | < 0.1%
61000 | 72 | < 0.1%
54000 | 71 | < 0.1%
4 | 69 | < 0.1%
56000 | 68 | < 0.1%
20000 | 67 | < 0.1%
25000 | 67 | < 0.1%
72000 | 67 | < 0.1%
Other values (136337) | 465090 | 97.8%
(Missing) | 6952 | 1.5%

Value | Count | Frequency (%)
1 | 307 | 0.1%
2 | 163 | < 0.1%
3 | 140 | < 0.1%
4 | 69 | < 0.1%
5 | 696 | 0.1%
6 | 141 | < 0.1%
7 | 98 | < 0.1%
8 | 76 | < 0.1%
9 | 51 | < 0.1%
10 | 513 | 0.1%

Value | Count | Frequency (%)
394967 | 1 | < 0.1%
391398 | 1 | < 0.1%
391202 | 1 | < 0.1%
390700 | 1 | < 0.1%
389281 | 1 | < 0.1%
386620 | 1 | < 0.1%
386078 | 1 | < 0.1%
385612 | 1 | < 0.1%
385314 | 1 | < 0.1%
384853 | 1 | < 0.1%

LISTPRICE
Real number (ℝ≥0)

MISSING

Distinct: 38717
Distinct (%): 9.0%
Missing: 44089
Missing (%): 9.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 22849.67779
Minimum: 781
Maximum: 1690400
Zeros: 0
Zeros (%): 0.0%
Memory size: 3.6 MiB

Quantile statistics

Minimum: 781
5-th percentile: 8995
Q1: 14495
Median: 19943
Q3: 29700
95-th percentile: 43995
Maximum: 1690400
Range: 1689619
Interquartile range (IQR): 15205

Descriptive statistics

Standard deviation: 12258.94317
Coefficient of variation (CV): 0.5365039843
Kurtosis: 1571.421816
Mean: 22849.67779
Median Absolute Deviation (MAD): 6949
Skewness: 15.05906768
Sum: 9861555338
Variance: 150281687.7
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
14995 | 2632 | 0.6%
15995 | 2612 | 0.5%
16995 | 2509 | 0.5%
13995 | 2484 | 0.5%
17995 | 2404 | 0.5%
12995 | 2361 | 0.5%
18995 | 2225 | 0.5%
19995 | 2223 | 0.5%
11995 | 2147 | 0.5%
10995 | 2076 | 0.4%
9995 | 2050 | 0.4%
15998 | 1672 | 0.4%
8995 | 1639 | 0.3%
14998 | 1549 | 0.3%
21995 | 1543 | 0.3%
16998 | 1483 | 0.3%
29995 | 1464 | 0.3%
24995 | 1432 | 0.3%
22995 | 1364 | 0.3%
23995 | 1331 | 0.3%
20995 | 1247 | 0.3%
7995 | 1232 | 0.3%
25995 | 1193 | 0.3%
17998 | 1187 | 0.2%
26995 | 1175 | 0.2%
Other values (38692) | 386350 | 81.2%
(Missing) | 44089 | 9.3%

Value | Count | Frequency (%)
781 | 1 | < 0.1%
971 | 1 | < 0.1%
1100 | 2 | < 0.1%
1200 | 1 | < 0.1%
1500 | 1 | < 0.1%
1700 | 1 | < 0.1%
1800 | 2 | < 0.1%
1900 | 2 | < 0.1%
2000 | 1 | < 0.1%
2100 | 1 | < 0.1%

Value | Count | Frequency (%)
1690400 | 1 | < 0.1%
1400000 | 1 | < 0.1%
1095000 | 1 | < 0.1%
999800 | 2 | < 0.1%
978900 | 1 | < 0.1%
219000 | 1 | < 0.1%
213576 | 1 | < 0.1%
130020 | 1 | < 0.1%
130000 | 1 | < 0.1%
129360 | 1 | < 0.1%

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for points lying on a straight line the steepness of the line does not affect r; only the sign of the slope does.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
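
The calculation described above can be sketched in plain Python. This is a minimal illustration rather than the routine the report itself uses; the function name `pearson_r` and the toy values are assumptions introduced here and are not taken from the dataset.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n normalising factors cancel between numerator and denominator,
    # so plain sums of deviations are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

# Hypothetical toy values (not from the dataset): price falls as mileage rises.
odometer = [10000, 30000, 50000, 80000, 120000]
price = [30000, 26000, 21000, 15000, 9000]

r = pearson_r(odometer, price)
# Shifting and positively rescaling a variable leaves r unchanged.
r_rescaled = pearson_r([v / 1000 + 5 for v in odometer], price)
print(r, r_rescaled)
```

The two printed values coincide up to floating-point error, which illustrates the invariance under changes of location and scale.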

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
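
As a minimal sketch of this definition, the code below ranks each variable (averaging the ranks of tied values, a common convention assumed here) and applies Pearson's formula to the ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are illustrative and not part of the report.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)

def spearman_rho(x, y):
    """Pearson's r computed on the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship: y = x cubed.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Because the relationship is perfectly monotonic, ρ reaches 1 while Pearson's r stays below 1.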

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
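
The pair-counting definition can be sketched directly. This version implements exactly the formula stated above (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations commonly use a tie-adjusted variant (tau-b), so results may differ when ties are present. The function name `kendall_tau` is an assumption made for illustration.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: x and y move in the same direction for this pair.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # A product of zero is a tie and counts toward neither total.
    return (concordant - discordant) / len(pairs)

# Three observations: two concordant pairs and one discordant pair.
tau = kendall_tau([1, 2, 3], [1, 3, 2])
print(tau)
```

The loop visits every pair, so this sketch is quadratic in the number of observations and is meant only to make the definition concrete.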

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
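
A minimal sketch of a bias-corrected Cramér's V follows, assuming the standard form of Bergsma's correction: the chi-squared-based φ² is reduced by (k-1)(r-1)/(n-1) and the row and column counts are shrunk accordingly. The function name, the handling of degenerate tables, and the toy labels are assumptions made for illustration and do not come from the report.

```python
import math
from collections import Counter

def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equal-length sequences of category labels."""
    n = len(x)
    row_totals = Counter(x)
    col_totals = Counter(y)
    observed = Counter(zip(x, y))
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic over every cell of the contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected upward bias of phi2 and clip at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        # Degenerate table, e.g. a constant column; returning 0 is a choice made for this sketch.
        return 0.0
    return math.sqrt(phi2_corr / denom)

# Hypothetical toy labels, not taken from the dataset.
drive = ["4WD", "4WD", "RWD", "RWD"]
body_same = ["Truck", "Truck", "Van", "Van"]  # perfectly associated with drive
body_mixed = ["Truck", "Van", "Truck", "Van"]  # independent of drive

v_perfect = cramers_v_corrected(drive, body_same)
v_independent = cramers_v_corrected(drive, body_mixed)
print(v_perfect, v_independent)
```

A constant column such as MAKE in this dataset would hit the degenerate branch, since a single category leaves nothing to associate with.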

Missing values

Sample

First rows

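(index) | POSTALCODE | STATE | MODELYEAR | MAKE | MODEL | TRIM | ENGINEDISPLACEMENT | ENGINECYLINDERCOUNT | TRANSMISSIONTYPE | DRIVETRAINTYPE | EXTERIORBASECOLOR | INTERIORMATERIAL | BODYTYPE | BODYCABSTYLE | ODOMETER | LISTPRICE
0 | 30655 | GA | 2016 | Ford | Aerostar | NaN | 3.7 | 6.0 | Automatic | RWD | White | Vinyl | Cutaway Van | NaN | 87970.0 | 29991.0
1 | 46124 | IN | 2016 | Ford | Aerostar | NaN | 3.0 | 6.0 | Automatic | RWD | White | NaN | Minivan/Van | NaN | 109470.0 | 25990.0
2 | 1453 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Silver | Leather | Hatchback | NaN | 56000.0 | 8900.0
3 | 1453 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Green | Leather | Hatchback | NaN | 81778.0 | 12999.0
4 | 1581 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Gray | Leather | Hatchback | NaN | 79566.0 | 10998.0
5 | 1720 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | White | Leather | Hatchback | NaN | 36077.0 | 10988.0
6 | 1906 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Silver | Leather | Hatchback | NaN | 102319.0 | 8795.0
7 | 2148 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Black | Leather | Hatchback | NaN | 66892.0 | NaN
8 | 2346 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Silver | NaN | Hatchback | NaN | 124630.0 | 5898.0
9 | 2420 | MA | 2013 | Ford | C-Max Energi | SEL | 2.0 | 4.0 | CVT | FWD | Silver | Leather | Hatchback | Wagon | 51084.0 | 8338.0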

Last rows

(index) | POSTALCODE | STATE | MODELYEAR | MAKE | MODEL | TRIM | ENGINEDISPLACEMENT | ENGINECYLINDERCOUNT | TRANSMISSIONTYPE | DRIVETRAINTYPE | EXTERIORBASECOLOR | INTERIORMATERIAL | BODYTYPE | BODYCABSTYLE | ODOMETER | LISTPRICE
475663 | 60165 | IL | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | White | NaN | SUV | NaN | 50301.0 | 23795.0
475664 | 60622 | IL | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Black | NaN | SUV | NaN | 95779.0 | 14995.0
475665 | 60622 | IL | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | White | NaN | SUV | NaN | 46892.0 | 24795.0
475666 | 60647 | IL | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | NaN | NaN | SUV | NaN | 25632.0 | 25500.0
475667 | 84003 | UT | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Black | Cloth | SUV | NaN | 35902.0 | 19889.0
475668 | 84003 | UT | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Black | Cloth | SUV | NaN | 19220.0 | 21727.0
475669 | 97266 | OR | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Blue | NaN | SUV | NaN | 43167.0 | 16791.0
475670 | 97266 | OR | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | White | NaN | SUV | NaN | 26559.0 | 16991.0
475671 | 97266 | OR | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | White | NaN | SUV | NaN | 22498.0 | 17491.0
475672 | 97266 | OR | 2018 | Ford | Utility Police Interceptor | Base | 3.7 | 6.0 | Automatic | AWD | Black | NaN | SUV | NaN | 4601.0 | 17791.0

Duplicate rows

Most frequent

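(index) | POSTALCODE | STATE | MODELYEAR | MAKE | MODEL | TRIM | ENGINEDISPLACEMENT | ENGINECYLINDERCOUNT | TRANSMISSIONTYPE | DRIVETRAINTYPE | EXTERIORBASECOLOR | INTERIORMATERIAL | BODYTYPE | BODYCABSTYLE | ODOMETER | LISTPRICE | count
0 | 28677 | NC | 2018 | Ford | Transit-250 | Base | 3.7 | 6.0 | Automatic | RWD | White | Vinyl | Cargo Van | Cargo Van | 5.0 | 34182.0 | 4
2 | 46526 | IN | 2017 | Ford | F-650SD | Base | 6.0 | 10.0 | Automatic | RWD | White | Cloth | Truck | Standard Cab | 2.0 | 107999.0 | 3
1 | 40601 | KY | 2015 | Ford | F-250SD | XL | 6.2 | 8.0 | Automatic | 4WD | White | Vinyl | Truck | Standard Cab | 163000.0 | 24900.0 | 2
3 | 48118 | MI | 2017 | Ford | Transit Connect | XL | 2.5 | 4.0 | Automatic | FWD | White | Vinyl | Cargo Van | Cargo Van | 69488.0 | 16455.0 | 2
4 | 55374 | MN | 2016 | Ford | F-150 | Lariat | 2.7 | 6.0 | Automatic | 4WD | Black | Leather | Truck | Extended Cab | 84506.0 | 28900.0 | 2
5 | 60014 | IL | 2017 | Ford | F-150 | XLT | 5.0 | 8.0 | Automatic | 4WD | Brown | Cloth | Truck | Crew Cab | 60842.0 | 31995.0 | 2
6 | 64601 | MO | 2016 | Ford | F-150 | XLT | 5.0 | 8.0 | Automatic | 4WD | Black | Cloth | Truck | Crew Cab | 80565.0 | 30395.0 | 2
7 | 68008 | NE | 2017 | Ford | F-150 | Lariat | 5.0 | 8.0 | Automatic | 4WD | White | Leather | Truck | Crew Cab | 28213.0 | 43823.0 | 2
8 | 73762 | OK | 2018 | Ford | F-550SD | XL | 6.0 | 10.0 | Automatic | 4WD | White | Vinyl | Truck | Extended Cab | 5.0 | 51995.0 | 2
9 | 85204 | AZ | 2018 | Ford | Transit-250 | Base | 3.7 | 6.0 | Automatic | RWD | White | Vinyl | Cargo Van | Cargo Van | 5.0 | 34795.0 | 2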